
45 research outputs found

The Structure-Function Linkage Database

Author: Akiva Eyal
Almonacid Daniel E.
Babbitt Patricia C.
Barber Alan E., 2nd
Brown Shoshana
Custer Ashley F.
Ferrin Thomas E.
Hicks Michael A.
Holliday Gemma L.
Huang Conrad C.
Lauck Florian
Mashiyama Susan T.
Meng Elaine C.
Mischel David
Morris John H.
Ojha Sunil
Schnoes Alexandra M.
Stryke Doug
Yunes Jeffrey M.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/10/2013

The Structure–Function Linkage Database (SFLD, http://sfld.rbvi.ucsf.edu/) is a manually curated classification resource describing structure–function relationships for functionally diverse enzyme superfamilies. Members of such superfamilies are diverse in their overall reactions yet share a common ancestor and some conserved active site features associated with conserved functional attributes such as a partial reaction. Thus, despite their different functions, members of these superfamilies ‘look alike’, making them easy to misannotate. To address this complexity and enable rational transfer of functional features to unknowns only for those members for which we have sufficient functional information, we subdivide superfamily members into subgroups using sequence information, and lastly into families, sets of enzymes known to catalyze the same reaction using the same mechanistic strategy. Browsing and searching options in the SFLD provide access to all of these levels. The SFLD offers manually curated as well as automatically classified superfamily sets, both accompanied by search and download options for all hierarchical levels. Additional information includes multiple sequence alignments, tab-separated files of functional and other attributes, and sequence similarity networks. The latter provide a new and intuitively powerful way to visualize functional trends mapped to the context of sequence similarity
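The three-level hierarchy described in this abstract (superfamily, subgroup, family) can be sketched as a small data structure. This is only an illustration under stated assumptions, not the SFLD's actual schema: the class name `Classification`, both function names, and all labels such as `superfamily_X` are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Classification:
    """Placement of one sequence in a superfamily > subgroup > family hierarchy."""
    superfamily: str
    subgroup: Optional[str] = None  # None if not placed in any subgroup
    family: Optional[str] = None    # None if not placed in any family


def most_specific_level(c: Classification) -> str:
    """Name the deepest hierarchy level at which the sequence is placed."""
    if c.family is not None:
        return "family"
    if c.subgroup is not None:
        return "subgroup"
    return "superfamily"


def transferable_reaction(c: Classification,
                          family_reactions: Dict[str, str]) -> Optional[str]:
    """Return a reaction only for sequences placed in a family whose
    reaction is known; otherwise return None (no transfer)."""
    if c.family is None:
        return None
    return family_reactions.get(c.family)


# Hypothetical labels, for illustration only.
reactions = {"family_a": "reaction_a"}
placed = Classification("superfamily_X", "subgroup_1", "family_a")
partial = Classification("superfamily_X", "subgroup_1")

placed_level = most_specific_level(placed)
partial_level = most_specific_level(partial)
placed_reaction = transferable_reaction(placed, reactions)
partial_reaction = transferable_reaction(partial, reactions)
```

The design choice here is conservative: a sequence placed only at the subgroup or superfamily level gets no reaction assigned, which mirrors the abstract's point that function should be transferred only where sufficient functional information exists.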

DSpace@MIT

PubMed Central

The Sorcerer II Global Ocean Sampling Expedition: Expanding the Universe of Protein Families

Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only GOS sequences are identified, out of which 1,700 have no detectable homology to known families. The GOS-only clusters contain a higher than expected proportion of sequences of viral origin, thus reflecting a poor sampling of viral diversity until now. Protein domain distributions in the GOS dataset and current protein databases show distinct biases. Several protein domains that were previously categorized as kingdom specific are shown to have GOS examples in other kingdoms. About 6,000 sequences (ORFans) from the literature that heretofore lacked similarity to known proteins have matches in the GOS data. The GOS dataset is also used to improve remote homology detection. Overall, besides nearly doubling the number of current proteins, the predicted GOS proteins also add a great deal of diversity to known protein families and shed light on their evolution. These observations are illustrated using several protein families, including phosphatases, proteases, ultraviolet-irradiation DNA damage repair enzymes, glutamine synthetase, and RuBisCO. The diversity added by GOS data has implications for choosing targets for experimental structure characterization as part of structural genomics efforts. Our analysis indicates that new families are being discovered at a rate that is linear or almost linear with the addition of new sequences, implying that we are still far from discovering all protein families in nature

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

UNT Digital Library

SmCL3, a Gastrodermal Cysteine Protease of the Human Blood Fluke Schistosoma mansoni

Author: A Michel
A Semenov
A Ylonen
AG Simpson
AJ Barrett
AL Williamson
B Bjellqvist
B Turk
BE Suzek
BJ Bogitsh
C Neveu
Charles S. Craik
CL Chappell
CM Stack
Conor R. Caffrey
CP Brady
CR Caffrey
D Greenbaum
D Sojka
DG Colley
DP McManus
DT Jones
E McCarthy
EF Pettersen
Elizabeth Hansell
Eric L. Schneider
F Liu
G Guncar
H Park
HJ Atkinson
J Dvořák
J Tort
James H. McKerrow
Jan Dvořák
JC Engel
JD Bendtsen
JG Beith
JH McKerrow
John Pius Dalton
JP Dalton
JT Palmer
K Chlichlia
KJ Livak
KM Karrer
L Roche
M Cancela
M Delcroix
M Sajid
Mahmoud Bahgat
ME Dumez
Melaine Delcroix
MH Abdulla
MM Wasilewski
Mohammed Sajid
MP Jacobson
MT Stubbs
MW Robinson
N Kongkerd
P Shannon
P Steinmann
Patricia C. Babbitt
PF Basch
PJ Brindley
RH Duvall
S Rozen
S Verjovski-Almeida
SF Altschul
Simon Braschi
Susan T. Mashiyama
T Vernet
T Wex
TH Kang
TH Le
V Wippersteg
W Li
Wilson H. McKerrow
WX Tian
Y Choe
YL Li
Publication venue: Public Library of Science
Publication date: 01/06/2009

Parasitic infection caused by blood flukes of the genus Schistosoma is a major global health problem. More than 200 million people are infected. Identifying and characterizing the constituent enzymes of the parasite's biochemical pathways should reveal opportunities for developing new therapies (i.e., vaccines, drugs). Schistosomes feed on host blood, and a number of proteolytic enzymes (proteases) contribute to this process. We have identified and characterized a new protease, SmCL3 (for Schistosoma mansoni cathepsin L3), that is found within the gut tissue of the parasite. We have employed various biochemical and molecular biological methods and sequence similarity analyses to characterize SmCL3 and obtain insights into its possible functions in the parasite, as well as its evolutionary position among cathepsin L proteases in general. SmCL3 hydrolyzes major host blood proteins (serum albumin and hemoglobin) and is expressed in parasite life stages infecting the mammalian host. Enzyme substrate specificity detected by positional scanning-synthetic combinatorial library was confirmed by molecular modeling. A sequence analysis placed SmCL3 in the cluster of other cathepsin L proteases, in accordance with previous phylogenetic analyses.

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Folate Deficiency Inhibits the Proliferation of Primary Human CD8 +

Author: Bruce N. Ames
Chantal Courtemanche
Ilan Elson-Schwab
Nicole Kerry
Susan T. Mashiyama
Publication venue: 'The American Association of Immunologists'

Crossref

A Global Comparison of the Human and T. brucei Degradomes Gives Insights about Possible Parasite Drug Targets

Author: Conor R. Caffrey
James H. McKerrow
Kyriacos Koupparis
Patricia C. Babbitt
Susan T. Mashiyama
Publication date: 01/01/2012

We performed a genome-level computational study of sequence and structure similarity, the latter using crystal structures and models, of the proteases of Homo sapiens and the human parasite Trypanosoma brucei. Using sequence and structure similarity networks to summarize the results, we constructed global views that show visually the relative abundance and variety of proteases in the degradome landscapes of these two species, and provide insights into evolutionary relationships between proteases. The results also indicate how broadly these sequence sets are covered by three-dimensional structures. These views facilitate cross-species comparisons and offer clues for drug design from knowledge about the sequences and structures of potential drug targets and their homologs. Two protease groups (“M32” and “C51”) that are very different in sequence from human proteases are examined in structural detail, illustrating the application of this global approach in mining new pathogen genomes for potential drug targets. Based on our analyses, a human ACE2 inhibitor was selected for experimental testing on one of these parasite proteases, TbM32, and was shown to inhibit it. These sequence and structure data, along with interactive versions of the protein similarity networks generated in this study, are available at http://babbittlab.ucsf.edu/resources.html.

CiteSeerX

Directory of Open Access Journals

PubMed Central

FigShare

Structure similarity network of human and T. brucei proteases using crystal structures and models.

Author: Conor R. Caffrey
James H. McKerrow
Kyriacos Koupparis
Patricia C. Babbitt
Susan T. Mashiyama

Nodes represent experimentally characterized (crystal structure) or modeled structures, and edges represent pairwise structural similarity above the structural similarity threshold (FAST SN score ≥4.5). Nodes for 342 human and 71 T. brucei structures are shown in the network (total of 413 nodes and 7,234 edges). The two T. brucei-specific families (TbM32 and C51) highlighted in the sequence similarity network shown in Figure 1 (http://www.plosntds.org/article/info:doi/10.1371/journal.pntd.0001942#pntd-0001942-g001) are circled in red. (A) Nodes are colored by MEROPS-associated family, revealing cross-family structural relationships. Human structures are represented as circles and T. brucei structures as triangles. (B) The same structure similarity network as in panel A is painted by species and structure representation. Nodes are color-coded by species, and node shape corresponds to the type of structure representation for that sequence: square = crystal structure; triangle = ModBase model; diamond = ModWeb model. In contrast to T. brucei, there are a large number of experimentally characterized (crystal) structures for humans, but many T. brucei structures can be modeled.

FigShare

The T. brucei M32 protease model shows active site similarity to a human protease ACE2.

Author: Conor R. Caffrey
James H. McKerrow
Kyriacos Koupparis
Patricia C. Babbitt
Susan T. Mashiyama

The model of the T. brucei M32 protease (TbM32m, purple) is shown structurally aligned with crystal structure ACE2 (PDB code 1R4L, yellow). Depicted in ball-and-stick representation near the zinc ion are the metal binding residues and catalytic glutamate. ACE2 inhibitor MLN4760 is shown in green and ACE inhibitor lisinopril is in orange stick format (the position of which is from a structural alignment of ACE (1O86) with ACE2). The predicted steric clash of R273 in the ACE2 S1 pocket with lisinopril is marked with an arrow. The R273 CZ of ACE2 is predicted to be 1.5 Å from the lisinopril C9, so that a terminal nitrogen of R273 is in position to overlap with an oxygen of lisinopril. The arginine (R348) from TbM32m that is predicted to be close to the ACE2 R273 is also in ball-and-stick representation. The inset shows the overall structural similarity of the two proteins.

FigShare

Structure alignment of T. brucei C51 model (TbC51m) with a distant structure homolog, human Cathepsin F (CatF).

Author: Conor R. Caffrey
James H. McKerrow
Kyriacos Koupparis
Patricia C. Babbitt
Susan T. Mashiyama

The superposition shows these two proteins have some general, overall structural similarities, but also large differences near the active site. The TbC51 model is colored in light orange, and the human CatF is in light green. While the catalytic Cys-His dyads are closely superimposed (depicted in ball-and-stick), a striking difference is marked by an arrow indicating the predicted steric clash between the CatF vinyl sulfone inhibitor (red) and the helix of TbC51 that partially obstructs the active site.

FigShare

Global view of predicted active proteases of human and T. brucei showing sequence similarity relationships.

Author: Conor R. Caffrey
James H. McKerrow
Kyriacos Koupparis
Patricia C. Babbitt
Susan T. Mashiyama

Protease sequences are represented as nodes, and similarity relationships between sequences better than the threshold (BLAST E-value ≤1e-5) are depicted as “edges” or lines between nodes. The network represents 594 human and 127 T. brucei sequences (total of 721 nodes and 10,188 edges). (A) Distribution by family of proteases. Nodes for human sequences are represented as circles and for T. brucei sequences as triangles, and are colored by MEROPS-associated family (see Methods, http://www.plosntds.org/article/info:doi/10.1371/journal.pntd.0001942#s2). Families of some of the larger clusters are labeled, and the parasite-specific C51 and M32 clusters are circled in red. (B) Structure coverage of sequence space is broad in human and T. brucei. The same sequence similarity network as in panel A is shown, except that it is color-coded by species and nodes are enlarged and designated by different shapes to denote whether a crystal structure or model exists for that sequence. Node shapes: square = crystal structure; triangle = ModBase model; diamond = ModWeb model; small circle = no structure.
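The network construction described in this caption (sequences as nodes, an edge wherever the pairwise BLAST E-value meets the 1e-5 threshold) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the `(query, subject, e_value)` input layout, and the example sequence identifiers are assumptions made for the sketch.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-5  # threshold quoted in the caption


def build_network(hits, cutoff=E_VALUE_CUTOFF):
    """Build an undirected adjacency map from (query, subject, e_value) tuples.

    Every sequence seen becomes a node; an edge is added only when the
    E-value is at or below the cutoff. Self-hits are ignored.
    """
    adjacency = defaultdict(set)
    for query, subject, e_value in hits:
        adjacency[query]    # touch the key so the node exists even without edges
        adjacency[subject]
        if query != subject and e_value <= cutoff:
            adjacency[query].add(subject)
            adjacency[subject].add(query)
    return dict(adjacency)


def count_edges(adjacency):
    """Each undirected edge is stored twice, once per endpoint."""
    return sum(len(neighbors) for neighbors in adjacency.values()) // 2


def clusters(adjacency):
    """Return connected components (as sets of node names) via iterative DFS."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(component)
    return components


# Hypothetical hits, for illustration only.
example_hits = [
    ("human_A", "human_B", 1e-30),
    ("human_B", "tb_C", 1e-6),
    ("tb_D", "tb_E", 1e-12),
    ("human_A", "tb_D", 0.5),  # too weak: no edge is drawn
]
network = build_network(example_hits)
n_nodes = len(network)
n_edges = count_edges(network)
component_sizes = sorted(len(c) for c in clusters(network))
```

Connected components are used here as a simple stand-in for the visually apparent clusters in such a network; the caption itself does not say how clusters were delineated.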

FigShare

Distribution by catalytic type of peptidases predicted to be active in humans and T. brucei.

Author: Conor R. Caffrey
James H. McKerrow
Kyriacos Koupparis
Patricia C. Babbitt
Susan T. Mashiyama

In humans, proteases of catalytic type S (where the catalytic moiety is serine) are dominant, but metallo (type M) and cysteine (type C) peptidases are also abundant. In contrast, in T. brucei, serine peptidases are less abundant, and cysteine and metallo proteases are equally prominent. Other main catalytic types in each organism include the threonine (type T) and aspartic (type A) proteases. Each catalytic type was assigned from the type designated for the family of the closest BLAST hit among MEROPS sequences.
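The assignment rule in this caption (take the catalytic type of the family of the closest MEROPS BLAST hit) can be sketched as below. The sketch relies on the MEROPS convention that a family identifier begins with a letter for its catalytic type (for example S1, C51, M32). The function name, the input tuple layout, and the query identifiers are hypothetical, and "closest" is assumed here to mean the lowest E-value.

```python
from collections import Counter

# First letter of a MEROPS family identifier -> catalytic type
CATALYTIC_TYPES = {
    "A": "aspartic",
    "C": "cysteine",
    "M": "metallo",
    "S": "serine",
    "T": "threonine",
}


def assign_catalytic_types(hits):
    """Map each query to the catalytic type of its best-scoring MEROPS hit.

    `hits` is an iterable of (query, merops_family, e_value) tuples; the hit
    with the lowest E-value per query is treated as the closest (an assumption).
    """
    best = {}
    for query, family, e_value in hits:
        if query not in best or e_value < best[query][1]:
            best[query] = (family, e_value)
    return {
        query: CATALYTIC_TYPES.get(family[0], "other")
        for query, (family, _) in best.items()
    }


# Hypothetical hits, for illustration only.
example_hits = [
    ("protein_1", "S1", 1e-50),
    ("protein_1", "C1", 1e-3),   # weaker hit, ignored
    ("protein_2", "M32", 1e-20),
    ("protein_3", "C51", 1e-8),
]
assigned = assign_catalytic_types(example_hits)
distribution = Counter(assigned.values())
```

Counting the assigned types, as in `distribution`, gives the kind of per-organism tally that the caption summarizes.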

FigShare
